Mining Frequent Itemsets Using Re-Usable Data Structure

Authors

Mohamed Yakout

Alaaeldin M. Hafez

Hussein Aly

Abstract

Several algorithms have been introduced for mining frequent itemsets. The recent dataset-transformation approach suffers either from a possible increase in the number of structures produced during execution of the algorithm or from the processing time spent projecting or decomposing the datasets. Moreover, the constructed structure cannot be re-used in ad hoc mining queries or in other mining processes. In this paper, the ItemSet Tree (IST) structure is used to count itemset support effectively and so overcome these limitations. To speed up the support counting process, the use of Guidance Information Bits and tree size reduction is proposed. The TDF algorithm is proposed to find all frequent itemsets. TDF explores the frequent-itemset search space depth-first, generating candidates from the search space and counting their support in the IST. Several experiments have been conducted to study the performance of the TDF algorithm.
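The abstract does not give the layout of the IST, the Guidance Information Bits, or the exact steps of TDF, so the following is only a minimal sketch of the general idea: build a tree over the transactions once, then answer support queries for candidates generated depth-first. It assumes a plain prefix tree over sorted transactions; the names `Node`, `build_tree`, `support`, and `mine_depth_first` are hypothetical and are not taken from the paper.

```python
class Node:
    """One node of a simple prefix tree over sorted transactions (assumed layout)."""

    def __init__(self):
        self.count = 0      # number of transactions whose sorted prefix reaches this node
        self.children = {}  # item -> Node


def build_tree(transactions):
    """Insert every transaction once; the tree can then be re-used for many queries."""
    root = Node()
    for transaction in transactions:
        root.count += 1
        node = root
        for item in sorted(set(transaction)):
            node = node.children.setdefault(item, Node())
            node.count += 1
    return root


def support(node, itemset, i=0):
    """Count transactions containing every item of the sorted tuple `itemset`."""
    if i == len(itemset):
        # All items matched: every transaction passing through this node qualifies.
        return node.count
    total = 0
    for item, child in node.children.items():
        if item == itemset[i]:
            total += support(child, itemset, i + 1)
        elif item < itemset[i]:
            total += support(child, itemset, i)
        # item > itemset[i]: paths are sorted, so itemset[i] cannot appear below.
    return total


def mine_depth_first(root, items, min_support):
    """Explore the itemset search space depth-first, counting support in the tree."""
    items = sorted(items)
    frequent = {}

    def extend(prefix, start):
        for k in range(start, len(items)):
            candidate = prefix + (items[k],)
            count = support(root, candidate)
            if count >= min_support:
                frequent[candidate] = count
                extend(candidate, k + 1)  # only extend frequent candidates

    extend((), 0)
    return frequent
```

Because the tree is built independently of any support threshold, the same `root` can serve several calls to `mine_depth_first` with different `min_support` values, or one-off calls to `support` for ad hoc queries, which is the kind of re-use the abstract argues for.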

Similar resources

A New Algorithm for High Average-utility Itemset Mining

High utility itemset mining (HUIM) is an emerging field in data mining that has gained growing interest due to its various applications. The goal is to discover all itemsets whose utility exceeds a minimum threshold. The basic HUIM problem does not consider the length of itemsets in its utility measurement, and utility values tend to become higher for itemsets containing more items...

Mining maximal frequent itemsets from data streams

Frequent pattern mining from data streams is an active research topic in data mining. Existing research efforts often rely on a two-phase framework to discover frequent patterns: (1) using internal data structures to store meta-patterns obtained by scanning the stream data; and (2) re-mining the meta-patterns to finalize and output frequent patterns. The defectiveness of such a two-phase framew...

Data sanitization in association rule mining based on impact factor

Data sanitization is a process used to promote the sharing of transactional databases among organizations and businesses; it alleviates the concerns of individuals and organizations regarding the disclosure of sensitive patterns. It transforms the source database into a released database so that counterparts cannot discover the sensitive patterns and so data confidentiality is preserved ag...

An Efficient Method for Mining Frequent Itemsets in Market Basket Data Analysis

Discovery of hidden and valuable knowledge from large data warehouses is an important research area and has attracted the attention of many researchers in recent years. Most Association Rule Mining (ARM) algorithms start by searching for frequent itemsets, scanning the whole database repeatedly and enumerating the occurrences of each candidate itemset. In data mining problems, the size of ...

Mining Frequent Closed Itemsets with the Frequent Pattern List

The mining of the complete set of frequent itemsets will lead to a huge number of itemsets. Fortunately, this problem can be reduced to the mining of frequent closed itemsets (FCIs), which results in a much smaller number of itemsets. The approaches to mining frequent closed itemsets can be categorized into two groups: those with candidate generation and those without. In this paper, we propose...

Publication date: 2007
